<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>GIN Tips and Tricks</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REV="MADE"
HREF="mailto:pgsql-docs@postgresql.org"><LINK
REL="HOME"
TITLE="PostgreSQL 9.1.2 Documentation"
HREF="index.html"><LINK
REL="UP"
TITLE="GIN Indexes"
HREF="gin.html"><LINK
REL="PREVIOUS"
TITLE="Implementation"
HREF="gin-implementation.html"><LINK
REL="NEXT"
TITLE="Limitations"
HREF="gin-limit.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="stylesheet.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=ISO-8859-1"><META
NAME="creation"
CONTENT="2011-12-01T22:07:59"></HEAD
><BODY
CLASS="SECT1"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="5"
ALIGN="center"
VALIGN="bottom"
><A
HREF="index.html"
>PostgreSQL 9.1.2 Documentation</A
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
TITLE="Implementation"
HREF="gin-implementation.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="top"
><A
HREF="gin.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="60%"
ALIGN="center"
VALIGN="bottom"
>Chapter 54. GIN Indexes</TD
><TD
WIDTH="20%"
ALIGN="right"
VALIGN="top"
><A
TITLE="Limitations"
HREF="gin-limit.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="GIN-TIPS"
>54.4. GIN Tips and Tricks</A
></H1
><P
></P
><DIV
CLASS="VARIABLELIST"
><DL
><DT
>Create vs. insert</DT
><DD
><P
>     Insertion into a <ACRONYM
CLASS="ACRONYM"
>GIN</ACRONYM
> index can be slow
     due to the likelihood of many keys being inserted for each item.
     So, for bulk insertions into a table it is advisable to drop the GIN
     index and recreate it after finishing bulk insertion.
    </P
><P
>     As of <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
> 8.4, this advice is less
     necessary since delayed indexing is used (see <A
HREF="gin-implementation.html#GIN-FAST-UPDATE"
>Section 54.3.1</A
> for details).  But for very large updates
     it may still be best to drop and recreate the index.
    </P
></DD
><DT
><A
HREF="runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM"
>maintenance_work_mem</A
></DT
><DD
><P
>     Build time for a <ACRONYM
CLASS="ACRONYM"
>GIN</ACRONYM
> index is very sensitive to
     the <TT
CLASS="VARNAME"
>maintenance_work_mem</TT
> setting; it doesn't pay to
     skimp on work memory during index creation.
    </P
></DD
><DT
><A
HREF="runtime-config-resource.html#GUC-WORK-MEM"
>work_mem</A
></DT
><DD
><P
>     During a series of insertions into an existing <ACRONYM
CLASS="ACRONYM"
>GIN</ACRONYM
>
     index that has <TT
CLASS="LITERAL"
>FASTUPDATE</TT
> enabled, the system will clean up
     the pending-entry list whenever the list grows larger than
     <TT
CLASS="VARNAME"
>work_mem</TT
>.  To avoid fluctuations in observed response time,
     it's desirable to have pending-list cleanup occur in the background
     (i.e., via autovacuum).  Foreground cleanup operations can be avoided by
     increasing <TT
CLASS="VARNAME"
>work_mem</TT
> or making autovacuum more aggressive.
     However, enlarging <TT
CLASS="VARNAME"
>work_mem</TT
> means that if a foreground
     cleanup does occur, it will take even longer.
    </P
></DD
><DT
><A
HREF="runtime-config-client.html#GUC-GIN-FUZZY-SEARCH-LIMIT"
>gin_fuzzy_search_limit</A
></DT
><DD
><P
>     The primary goal of developing <ACRONYM
CLASS="ACRONYM"
>GIN</ACRONYM
> indexes was
     to create support for highly scalable full-text search in
     <SPAN
CLASS="PRODUCTNAME"
>PostgreSQL</SPAN
>, and there are often situations when
     a full-text search returns a very large set of results.  Moreover, this
     often happens when the query contains very frequent words, so that the
     large result set is not even useful.  Since reading many
     tuples from the disk and sorting them could take a lot of time, this is
     unacceptable for production.  (Note that the index search itself is very
     fast.)
    </P
><P
>     To facilitate controlled execution of such queries,
     <ACRONYM
CLASS="ACRONYM"
>GIN</ACRONYM
> has a configurable soft upper limit on the
     number of rows returned: the
     <TT
CLASS="VARNAME"
>gin_fuzzy_search_limit</TT
> configuration parameter.
     It is set to 0 (meaning no limit) by default.
     If a non-zero limit is set, then the returned set is a subset of
     the whole result set, chosen at random.
    </P
><P
>     <SPAN
CLASS="QUOTE"
>"Soft"</SPAN
> means that the actual number of returned results
     could differ somewhat from the specified limit, depending on the query
     and the quality of the system's random number generator.
    </P
><P
>     From experience, values in the thousands (e.g., 5000 &mdash; 20000)
     work well.
    </P
></DD
></DL
></DIV
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="gin-implementation.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="gin-limit.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Implementation</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="gin.html"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Limitations</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>